To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers $13$ common text generation tasks and their corresponding $83$ datasets, and further incorporates $45$ PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement $4$ efficient training strategies and provide $4$ generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be carried out in a unified way. Despite its rich functionality, the library is easy to use, either through the friendly Python API or the command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at https://github.com/RUCAIBox/TextBox.
translated by 谷歌翻译
Establishing open and general benchmarks has been a critical driving force behind the success of modern machine learning techniques. As machine learning is being applied to broader domains and tasks, there is a need to establish richer and more diverse benchmarks to better reflect the reality of the application scenarios. Graph learning is an emerging field of machine learning that urgently needs more and better benchmarks. To accommodate the need, we introduce Graph Learning Indexer (GLI), a benchmark curation platform for graph learning. In comparison to existing graph learning benchmark libraries, GLI highlights two novel design objectives. First, GLI is designed to incentivize \emph{dataset contributors}. In particular, we incorporate various measures to minimize the effort of contributing and maintaining a dataset, increase the usability of the contributed dataset, as well as encourage attributions to different contributors of the dataset. Second, GLI is designed to curate a knowledge base, instead of a plain collection, of benchmark datasets. We use multiple sources of meta information to augment the benchmark datasets with \emph{rich characteristics}, so that they can be easily selected and used in downstream research or development. The source code of GLI is available at \url{https://github.com/Graph-Learning-Benchmarks/gli}.
Neural networks, especially the recently proposed neural operator models, are increasingly being used to find the solution operator of differential equations. Compared to traditional numerical solvers, they are much faster and more efficient in practical applications. However, one critical issue is that training neural operator models requires a large amount of ground-truth data, which usually comes from slow numerical solvers. In this paper, we propose a physics-guided data augmentation (PGDA) method to improve the accuracy and generalization of neural operator models. Training data is augmented naturally through the physical properties of differential equations, such as linearity and translation. We demonstrate the advantage of PGDA on a variety of linear differential equations, showing that PGDA can improve the sample complexity and is robust to distributional shift.
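As an illustration of augmentation through linearity and translation, the following is a minimal sketch, not the authors' implementation. It assumes a toy linear, translation-invariant problem (u - u'' = f on a periodic grid, solved spectrally); the function names `solve_periodic`, `augment_linear`, and `augment_shift` are hypothetical. New input/solution pairs are produced from existing ones without calling the solver again.

```python
import numpy as np

def solve_periodic(f):
    """Toy 'slow solver' (assumption): solves u - u'' = f on a periodic grid over [0, 1)."""
    n = f.shape[-1]
    k = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / n)  # angular wavenumbers
    return np.real(np.fft.ifft(np.fft.fft(f) / (1.0 + k ** 2)))

def augment_linear(f1, u1, f2, u2, a, b):
    """Linearity: if G(f1) = u1 and G(f2) = u2, then G(a*f1 + b*f2) = a*u1 + b*u2."""
    return a * f1 + b * f2, a * u1 + b * u2

def augment_shift(f, u, shift):
    """Translation: shifting the input shifts the solution by the same amount."""
    return np.roll(f, shift, axis=-1), np.roll(u, shift, axis=-1)

rng = np.random.default_rng(0)
f1, f2 = rng.standard_normal((2, 64))
u1, u2 = solve_periodic(f1), solve_periodic(f2)  # the only solver calls needed

f_lin, u_lin = augment_linear(f1, u1, f2, u2, a=0.7, b=-1.3)
f_sh, u_sh = augment_shift(f1, u1, shift=5)
```

Both augmented pairs are exact solutions of the toy problem, so in this setting they can be added to the training set at negligible cost. The translation rule relies on the periodic boundary and constant coefficients assumed here.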
Accurate polyp segmentation is of great importance for colorectal cancer diagnosis and treatment. However, due to the high cost of producing accurate mask annotations, existing polyp segmentation methods suffer from severe data shortage and impaired model generalization. In contrast, coarse polyp bounding box annotations are more accessible. Thus, in this paper, we propose a boosted BoxPolyp model to make full use of both accurate mask annotations and extra coarse box annotations. In practice, box annotations are used to alleviate the over-fitting of previous polyp segmentation models, and fine-grained polyp areas are generated through an iteratively boosted segmentation model. To achieve this goal, a fusion filter sampling (FFS) module is first proposed to generate pixel-wise pseudo labels with less noise from box annotations, leading to significant performance improvements. In addition, considering the appearance consistency of the same polyp, an image consistency (IC) loss is designed. The IC loss explicitly narrows the distance between features extracted by two different networks, which improves the robustness of the model. Note that BoxPolyp is a plug-and-play model that can be merged into any appealing backbone. Quantitative and qualitative experimental results on five challenging benchmarks confirm that the proposed model outperforms previous state-of-the-art methods by a large margin.
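The idea of a loss that narrows the distance between features from two networks can be sketched as follows. This is only an illustrative assumption: the abstract does not give the exact form of the IC loss, so the sketch uses a mean squared distance between channel-normalised feature maps, and the name `image_consistency_loss` is hypothetical.

```python
import numpy as np

def image_consistency_loss(feat_a, feat_b, eps=1e-8):
    """Mean squared distance between two (C, H, W) feature maps.

    Assumption: features are L2-normalised along the channel axis first,
    so the loss compares directions rather than magnitudes.
    """
    a = feat_a / (np.linalg.norm(feat_a, axis=0, keepdims=True) + eps)
    b = feat_b / (np.linalg.norm(feat_b, axis=0, keepdims=True) + eps)
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
feat_net1 = rng.standard_normal((8, 4, 4))                    # features from network 1
feat_net2 = feat_net1 + 0.1 * rng.standard_normal((8, 4, 4))  # similar features from network 2
feat_other = rng.standard_normal((8, 4, 4))                   # unrelated features

loss_close = image_consistency_loss(feat_net1, feat_net2)
loss_far = image_consistency_loss(feat_net1, feat_other)
```

Minimising such a term pushes the two networks towards agreeing on the same polyp, which is the behaviour the abstract attributes to the IC loss.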
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene in human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of the medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.
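What makes few-shot learning desirable in medical image analysis is its efficient use of support image data, which are labelled in order to classify or segment new classes, a task that otherwise requires substantially more training images and expert annotations. This work describes a fully 3D prototypical few-shot segmentation algorithm, such that trained networks can be effectively adapted to clinically interesting structures that are absent from training, using only a few labelled images from a different institution. First, to compensate for the widely recognised spatial variability between institutions in the episodic adaptation of novel classes, a novel spatial registration mechanism is integrated into prototypical learning, consisting of a segmentation head and a spatial alignment module. Second, to assist training under the observed imperfect alignment, a support mask conditioning module is proposed to further utilise the annotations available from the support images. Experiments are presented in an application of segmenting eight anatomical structures important for interventional planning, using a dataset of 589 pelvic T2-weighted MR images acquired at multiple institutions. The results demonstrate the efficacy of each of the 3D formulation, the spatial registration, and the support mask conditioning, all of which made positive contributions independently or collectively. Compared with previously proposed 2D alternatives, few-shot segmentation performance improved with statistical significance, regardless of whether the support data came from the same or a different institution.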
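In this work, we propose NARRATE, a novel pipeline that enables simultaneous editing of portrait lighting and perspective in a photorealistic manner. As a hybrid neural-physical face model, NARRATE leverages the complementary benefits of geometry-aware generative approaches and normal-assisted physical face models. In a nutshell, NARRATE first inverts the input portrait to a coarse geometry and employs neural rendering to generate images that resemble the input and exhibit convincing pose changes. However, the inversion step introduces a mismatch, yielding low-quality images with fewer facial details. We therefore further estimate the portrait normal to enhance the coarse geometry, creating a high-fidelity physical face model. In particular, we fuse the neural and physical renderings to compensate for the imperfect inversion, resulting in novel-perspective images that are both realistic and view-consistent. In the relighting stage, previous works focus on single-view portrait relighting and ignore consistency between different perspectives, leading to unstable and inconsistent lighting effects under view changes. We address this problem by unifying the multi-view input normal maps with the physical face model. NARRATE conducts relighting with consistent normal maps, imposing cross-view constraints and exhibiting stable and coherent illumination effects. We experimentally demonstrate that NARRATE achieves more photorealistic and reliable results than prior works. We further bridge NARRATE with animation and style transfer tools, supporting pose change, light change, facial animation, and style transfer, either separately or in combination, all at photographic quality. We showcase vivid free-view facial animations as well as 3D-aware relightable stylization, which help facilitate various AR/VR applications such as virtual cinematography, 3D video conferencing, and post-production.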
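Learning effective reinforcement learning (RL) policies to solve complex real-world tasks can be quite challenging without a high-fidelity simulation environment. In most cases, we are only given imperfect simulators with simplified dynamics, which inevitably lead to severe sim-to-real gaps in RL policy learning. The recently emerged field of offline RL offers another possibility: learning policies directly from pre-collected historical data. However, to achieve reasonable performance, existing offline RL algorithms need impractically large offline datasets with sufficient state-action space coverage for training. This raises a new question: is it possible to combine learning from limited real data in offline RL with unrestricted exploration through imperfect simulators in online RL, so as to address the drawbacks of both approaches? In this study, we propose the Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning (H2O) framework to provide an affirmative answer to this question. H2O introduces a dynamics-aware policy evaluation scheme, which adaptively penalizes Q-function learning on simulated state-action pairs with large dynamics gaps, while also allowing learning from a fixed real-world dataset. Through extensive simulation and real-world tasks, as well as theoretical analysis, we demonstrate the superior performance of H2O over other cross-domain online and offline RL algorithms. H2O provides a brand-new hybrid offline-and-online RL paradigm, which may shed light on future RL algorithm design for solving practical real-world tasks.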
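The notion of adaptively penalizing Q-values on simulated pairs with large dynamics gaps can be illustrated with a small sketch. This is not the H2O objective itself, whose exact form the abstract does not state. It assumes a softmax weighting over estimated dynamics gaps and a simple penalty that pushes down Q-values on simulated pairs relative to real ones; the names `gap_weights`, `penalised_q_loss`, and `beta` are hypothetical.

```python
import numpy as np

def gap_weights(dynamics_gap):
    """Normalised weights that grow with the estimated dynamics gap (softmax, an assumption)."""
    z = dynamics_gap - dynamics_gap.max()  # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def penalised_q_loss(q_sim, target_sim, q_real, target_real, dynamics_gap, beta=1.0):
    """Bellman error on both data sources plus a gap-weighted penalty on simulated Q-values."""
    bellman = 0.5 * np.mean((q_real - target_real) ** 2) \
            + 0.5 * np.mean((q_sim - target_sim) ** 2)
    penalty = np.sum(gap_weights(dynamics_gap) * q_sim) - np.mean(q_real)
    return float(bellman + beta * penalty)

# Three simulated pairs; the last one has a much larger estimated dynamics gap.
q_sim = np.array([1.0, 1.0, 1.0])
q_real = np.array([0.5, 0.8])
gap = np.array([0.1, 0.1, 2.0])

# Raise the Q-value by the same amount on a low-gap pair and on the high-gap pair.
loss_low = penalised_q_loss(q_sim + np.array([0.5, 0.0, 0.0]), q_sim, q_real, q_real, gap)
loss_high = penalised_q_loss(q_sim + np.array([0.0, 0.0, 0.5]), q_sim, q_real, q_real, gap)
```

In this toy setup, overestimating the value of the high-gap pair costs more than the same overestimate on a low-gap pair, which is the qualitative behaviour the abstract describes. How the dynamics gap is estimated is left out of the sketch.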
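Colorectal polyp classification is a critical clinical examination. To improve classification accuracy, most computer-aided diagnosis algorithms recognize colorectal polyps using narrow-band imaging (NBI). However, NBI is often underused in real clinical scenarios, because acquiring these images requires manually switching the light mode once polyps have been detected in white-light (WL) images. To avoid this situation, we propose a novel method that directly achieves accurate white-light colonoscopy image classification by enforcing structured cross-modal representation consistency. In practice, a pair of multi-modal images, i.e. NBI and WL, is fed into a shared Transformer to extract hierarchical feature representations. A newly designed spatial attention module (SAM) is then adopted to calculate the similarities between the class token and the patch tokens for a specific modality image. By aligning the class tokens and spatial attention maps of paired NBI and WL images at multiple levels, the Transformer is able to maintain both global and local representation consistency across the two modalities. Extensive experimental results show that the proposed method outperforms recent studies, achieving multi-modal prediction with a single Transformer while greatly improving classification accuracy when only WL images are used.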
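Developing an AI-assisted gland segmentation method from histology images is critical for automatic cancer diagnosis and prognosis. However, the high cost of pixel-level annotations hinders its application to a broader range of diseases. Existing weakly supervised semantic segmentation methods in computer vision produce degenerate results for gland segmentation, because the characteristics and problems of glandular datasets differ from those of general object datasets. We observe that, unlike natural images, the key problem with histology images is class confusion caused by morphological homogeneity and low color contrast among different tissues. To this end, we propose a novel method, Online Easy Example Mining (OEEM), which encourages the network to focus on credible supervision signals rather than noisy ones, thereby mitigating the influence of inevitable false predictions in the pseudo-masks. Based on the characteristics of glandular datasets, we design a strong framework for gland segmentation. Our results exceed many fully supervised and weakly supervised methods for gland segmentation by more than 4.4% and 6.04% in mIoU, respectively. Code is available at https://github.com/xmed-lab/oeem.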
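The idea of focusing on credible supervision signals can be sketched as a confidence-weighted pixel loss. This is an illustrative assumption rather than the OEEM formulation, which the abstract does not spell out. Here each pixel's cross-entropy against its pseudo-label is weighted by a softmax over negative losses, so "easy" pixels (low loss, likely correct pseudo-labels) dominate; all function names and the `temperature` parameter are hypothetical.

```python
import numpy as np

def pixel_cross_entropy(probs, pseudo_labels):
    """Per-pixel cross-entropy; probs has shape (N, C), pseudo_labels has shape (N,)."""
    p = probs[np.arange(len(pseudo_labels)), pseudo_labels]
    return -np.log(np.clip(p, 1e-12, 1.0))

def easy_example_weights(losses, temperature=1.0):
    """Softmax over negative losses: low-loss (easy) pixels receive larger weights."""
    z = -losses / temperature
    z = z - z.max()  # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def easy_mining_loss(probs, pseudo_labels, temperature=1.0):
    losses = pixel_cross_entropy(probs, pseudo_labels)
    # In a real training framework the weights would be treated as constants (no gradient).
    weights = easy_example_weights(losses, temperature)
    return float(np.sum(weights * losses))

# Three pixels with pseudo-label 0; the third prediction disagrees (possibly a noisy label).
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8]])
pseudo_labels = np.array([0, 0, 0])

losses = pixel_cross_entropy(probs, pseudo_labels)
weights = easy_example_weights(losses)
weighted = easy_mining_loss(probs, pseudo_labels)
```

With a temperature of 1, the weights are proportional to the predicted probability of the pseudo-label, so the disagreeing pixel contributes least and the weighted loss falls below the plain mean loss.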